PRODUCTION AND USE OF TYPE 5 17BETA-HYDROXYSTEROID DEHYDROGENASE



BACKGROUND OF THE INVENTION
Field of the Invention

The present invention relates to the isolation and characterization of a novel
enzyme which is implicated in the production of sex steroids, and more particularly,
to the characterization of the gene and cDNA of a novel 20α, 17β-hydroxysteroid
dehydrogenase (hereinafter Type 5 17β-HSD) which has been implicated in the
conversion of progesterone and 4-androstenedione (Δ4-dione) to 20α-
hydroxyprogesterone (20α-OH-P) and testosterone (T), respectively. The use of this
enzyme in an assay for inhibitors of the enzyme is also described, as are several other
uses of the DNA, fragments thereof and antisense fragments thereof.

Description of the Related Art

The enzymes identified as 17β-HSDs are important in the production of human
sex steroids, including androst-5-ene-3β,17β-diol (Δ5-diol), testosterone and estradiol.
It was once thought that a single gene encoded a single type of 17β-HSD which was
responsible for catalyzing all of the reactions. However, in humans, several types of
17β-HSD have now been identified and characterized. Each type of 17β-HSD has
been found to catalyze specific reactions and to be located in specific tissues. Further
information about Types 1, 2 and 3 17β-HSD can be had by reference as follows:
Type 1 17β-HSD is described by Luu-The, V. et al., Mol. Endocrinol., 3:1301-1309
(1989) and by Peltoketo, H. et al., FEBS Lett., 239:73-77 (1988); Type 2 17β-HSD is
described in Wu, L. et al., J. Biol. Chem., 268:12964-12969 (1993); Type 3 17β-HSD
is described in Geissler, W.M., Nature Genetics, 7:34-39 (1994). A fourth type, which
is homologous to porcine ovarian 17β-HSD, has recently been identified by researchers
Adamski and de Launoit; however, applicant is not presently aware of published
information on this type.

The present invention relates to a fifth type of 17β-HSD which is described in
detail below.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide a novel 17β-hydroxysteroid
dehydrogenase (17β-HSD) which is identified as type 5 17β-HSD.

It is also an object of the present invention to provide a 17β-HSD which has
been shown to be involved in the conversion of Δ4-dione to testosterone and in the
conversion of progesterone to 20α-hydroxyprogesterone (20α-OH-P).

It is a further object of this invention to provide the nucleotide sequences and a
gene map for type 5 17β-HSD.

It is also an object of this invention to provide methods of using type 5 17β-
HSD in an assay to identify compounds which inhibit the activity of this enzyme, and
thus may reduce production of testosterone or 20α-hydroxyprogesterone, and can be
used to treat medical conditions which respond unfavorably to these steroids.

It is an additional object of this invention to provide methods of preventing the
synthesis of type 5 17β-HSD by administering an antisense RNA of the gene sequence
of type 5 17β-HSD to interfere with the translation of the gene's mRNA.

These and other objects are discussed herein.

In particular, a novel enzyme, type 5 17β-hydroxysteroid dehydrogenase, has
been identified and characterized. The gene sequence for this type 5 17β-HSD was
found to encode a protein of 323 amino acids, having an apparent calculated molecular
weight of 36,844 daltons. The protein is encoded by nucleotides +11 through 982,
including the stop codon (amino acids +1 through 323), numbered in the 5' to 3'
direction, in the following sequence (SEQ ID Nos. 1 and 2):

GTGACAGGGA ATG GAT TCC AAA CAG CAG TGT GTA AAG CTA AAT GAT GGC  49
           Met Asp Ser Lys Gln Gln Cys Val Lys Leu Asn Asp Gly
           1               5                   10

CAC TTC ATG CCT GTA TTG GGA TTT GGC ACC TAT GCA CCT CCA GAG GTT  97
His Phe Met Pro Val Leu Gly Phe Gly Thr Tyr Ala Pro Pro Glu Val
    15                  20                  25

CCG AGA AGT AAA GCT TTG GAG GTC ACC AAA TTA GCA ATA GAA GCT GGG  145
Pro Arg Ser Lys Ala Leu Glu Val Thr Lys Leu Ala Ile Glu Ala Gly
30                  35                  40                  45

TTC CGC CAT ATA GAT TCT GCT CAT TTA TAC AAT AAT GAG GAG CAG GTT  193
Phe Arg His Ile Asp Ser Ala His Leu Tyr Asn Asn Glu Glu Gln Val
                50                  55                  60

GGA CTG GCC ATC CGA AGC AAG ATT GCA GAT GGC AGT GTG AAG AGA GAA  241
Gly Leu Ala Ile Arg Ser Lys Ile Ala Asp Gly Ser Val Lys Arg Glu
            65                  70                  75

GAC ATA TTC TAC ACT TCA AAG CTT TGG TCC ACT TTT CAT CGA CCA GAG  289
Asp Ile Phe Tyr Thr Ser Lys Leu Trp Ser Thr Phe His Arg Pro Glu
        80                  85                  90

TTG GTC CGA CCA GCC TTG GAA AAC TCA CTG AAA AAA GCT CAA TTG GAC  337
Leu Val Arg Pro Ala Leu Glu Asn Ser Leu Lys Lys Ala Gln Leu Asp
    95                  100                 105

TAT GTT GAC CTC TAT CTT ATT CAT TCT CCA ATG TCT CTA AAG CCA GGT  385
Tyr Val Asp Leu Tyr Leu Ile His Ser Pro Met Ser Leu Lys Pro Gly
110                 115                 120                 125

GAG GAA CTT TCA CCA ACA GAT GAA AAT GGA AAA GTA ATA TTT GAC ATA  433
Glu Glu Leu Ser Pro Thr Asp Glu Asn Gly Lys Val Ile Phe Asp Ile
                130                 135                 140

GTG GAT CTC TGT ACC ACC TGG GAG GCC ATG GAG AAG TGT AAG GAT GCA  481
Val Asp Leu Cys Thr Thr Trp Glu Ala Met Glu Lys Cys Lys Asp Ala
            145                 150                 155

GGA TTG GCC AAG TCC ATT GGG GTG TCA AAC TTC AAC CGC AGG CAG CTG  529
Gly Leu Ala Lys Ser Ile Gly Val Ser Asn Phe Asn Arg Arg Gln Leu
        160                 165                 170

GAG ATG ATC CTC AAC AAG CCA GGA CTC AAG TAC AAG CCT GTC TGC AAC  577
Glu Met Ile Leu Asn Lys Pro Gly Leu Lys Tyr Lys Pro Val Cys Asn
    175                 180                 185

CAG GTA GAA TGT CAT CCG TAT TTC AAC CGG AGT AAA TTG CTA GAT TTC  625
Gln Val Glu Cys His Pro Tyr Phe Asn Arg Ser Lys Leu Leu Asp Phe
190                 195                 200                 205

TGC AAG TCG AAA GAT ATT GTT CTG GTT GCC TAT AGT GCT CTG GGA TCT  673
Cys Lys Ser Lys Asp Ile Val Leu Val Ala Tyr Ser Ala Leu Gly Ser
                210                 215                 220

CAA CGA GAC AAA CGA TGG GTG GAC CCG AAC TCC CCG GTG CTC TTG GAG  721
Gln Arg Asp Lys Arg Trp Val Asp Pro Asn Ser Pro Val Leu Leu Glu
            225                 230                 235

GAC CCA GTC CTT TGT GCC TTG GCA AAA AAG CAC AAG CGA ACC CCA GCC  769
Asp Pro Val Leu Cys Ala Leu Ala Lys Lys His Lys Arg Thr Pro Ala
        240                 245                 250

CTG ATT GCC CTG CGC TAC CAG CTG CAG CGT GGG GTT GTG GTC CTG GCC  817
Leu Ile Ala Leu Arg Tyr Gln Leu Gln Arg Gly Val Val Val Leu Ala
    255                 260                 265

AAG AGC TAC AAT GAG CAG CGC ATC AGA CAG AAC GTG CAG GTT TTT GAG  865
Lys Ser Tyr Asn Glu Gln Arg Ile Arg Gln Asn Val Gln Val Phe Glu
270                 275                 280                 285

TTC CAG TTG ACT GCA GAG GAC ATG AAA GCC ATA GAT GGC CTA GAC AGA  913
Phe Gln Leu Thr Ala Glu Asp Met Lys Ala Ile Asp Gly Leu Asp Arg
                290                 295                 300

AAT CTC CAC TAT TTT AAC AGT GAT AGT TTT GCT AGC CAC CCT AAT TAT  961
Asn Leu His Tyr Phe Asn Ser Asp Ser Phe Ala Ser His Pro Asn Tyr
            305                 310                 315

CCA TAT TCA GAT GAA TAT TAA CATGGAGACTTTGCCTGATGATGTCTACCA  1012
Pro Tyr Ser Asp Glu Tyr *
        320

GAAGGCCCTGTGTGTGGATGGTGACGCAGAGGACGTCTCTATGCCGGTGACTGGACATAT  1072
CAarrCTACTTAAATCCGTCCTGmAGCGAm(>«5TCAAaADtf3CTC^ 1132 
CCAGAAATACAATAAATCCTGmAGCGACTTCAGTCAACTACAQCTCACTCC^^ 1192 
AGAAATACAATAAA  1206
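
By way of illustration only, the numbering convention used for SEQ ID Nos. 1 and 2
can be checked with a short computational sketch. The following Python assumes the
standard genetic code; the names CODON_TABLE, translate and FRAGMENT are
hypothetical and form no part of the disclosure. It translates the first 46
nucleotides of SEQ ID No. 1 beginning at nucleotide +11, and checks that nucleotides
+11 through 982 correspond to 323 codons plus one stop codon.

```python
# Illustrative sketch only: standard genetic code, bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cdna, start):
    """Translate cdna from the 1-based position `start` until a stop codon
    or the end of the sequence, returning one-letter amino acid codes."""
    protein = []
    for pos in range(start - 1, len(cdna) - 2, 3):
        residue = CODON_TABLE[cdna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 46 nucleotides of SEQ ID No. 1: 10 nt of 5' noncoding sequence
# followed by the first 12 codons of the coding region.
FRAGMENT = "GTGACAGGGAATGGATTCCAAACAGCAGTGTGTAAAGCTAAATGAT"
N_TERMINUS = translate(FRAGMENT, start=11)

# Nucleotides +11 through 982 inclusive, stop codon included.
CODING_NT = 982 - 11 + 1
CODONS_INCLUDING_STOP = CODING_NT // 3
```

Under these assumptions the number of sense codons is CODONS_INCLUDING_STOP minus
one, which agrees with the stated length of 323 amino acids.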

In addition, a complete gene map (Figure 5) and nucleotide sequences (SEQ ID Nos.
3 through 29 and Figures 6A and 6B) of the chromosomal DNA of type 5 17β-HSD
are provided. A more detailed description of the sequences will be provided infra.

The present invention includes methods for the synthetic production of type 5
17β-HSD, as well as peptides that are biologically functionally equivalent, and
methods of using these compounds to screen test compounds for their ability to inhibit
or alter the enzymatic function. In addition, methods of producing antisense
constructs to the type 5 17β-HSD gene's DNA or mRNA or portions thereof, and the
use of those antisense constructs to interfere with the transcription or translation of the
enzyme are also provided.

The nucleotide sequence which encodes type 5 17β-HSD and recombinant
expression vectors which include the sequence may be modified so long as they
continue to encode a functionally equivalent enzyme. Moreover, it is contemplated,
within the invention, that codons within the coding region may be altered, inter alia,
in a manner which, given the degeneracy of the genetic code, continues to encode the
same protein or one providing a functionally equivalent protein. It is believed that
nucleotide sequences analogous to SEQ ID No. 1, or those that hybridize under
stringent conditions to the coding region of SEQ ID No. 1, are likely to encode a type
5 17β-HSD functionally equivalent to that encoded by the coding region of SEQ ID
No. 1, especially if such analogous nucleotide sequence is at least 700, preferably at
least 850 and most preferably at least 969 nucleotides in length. As used herein,
except where otherwise specified, "stringent conditions" means 0.1x SSC (0.3 M
sodium chloride and 0.03 M sodium citrate) and 0.1% sodium dodecyl sulphate (SDS)
and 60° C.



It is also likely that tissues or cells from human or non-human sources, and
which tissues or cells have the enzymatic machinery to convert Δ4-dione to
testosterone, or to convert progesterone to 20α-hydroxyprogesterone, include a type 5
17β-HSD sufficiently analogous to human type 5 17β-HSD to be used in accordance
with the present invention. In particular, cDNA libraries prepared from cells
performing the foregoing conversions may be screened with probes prepared, in
accordance with well known techniques, by reference to the nucleotides disclosed
herein, and under varying degrees of stringency, in order to identify analogous cDNAs
in other species. These analogous cDNAs are preferably at least 70% homologous to
SEQ ID No. 1, more preferably at least 80% homologous, and most preferably at
least 90% homologous. They preferably include stretches of perfect identity at least
10 nucleotides long, more preferably stretches of 15, 20 or even 30 nucleotides of
perfect identity. Appropriate probes may be prepared from SEQ ID No. 1 or
fragments thereof of suitable length, preferably at least 15 nucleotides in length.
Confirmation with at least two distinct probes is preferred. Alternative isolation
strategies, such as polymerase chain reaction (PCR) amplification, may also be used.

Homologous type 5 17β-HSDs so obtained, as well as the genes encoding
them, are used in accordance with the invention in all of the ways for using SEQ ID
No. 2 and SEQ ID No. 1, respectively.

Recombinant expression vectors can include the entire coding region for
human type 5 17β-HSD as shown in SEQ ID No. 1, the coding region for human type
5 17β-HSD which has been modified, portions of the coding region for human type 5
17β-HSD, the chromosomal DNA of type 5 17β-HSD, an antisense construct to type
5 17β-HSD, or portions of antisense constructs to type 5 17β-HSD.

In the context of the invention, "isolated" means having a higher purity than
exists in nature, but does not require purification from a natural source. Isolated
nucleotides encoding type 5 17β-HSD may be produced synthetically, or by isolating
cDNA thereof from a cDNA library or by any of numerous other methods well
understood in the art.
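
The homology criteria discussed above (percent homology to SEQ ID No. 1 and
stretches of perfect identity) can be illustrated, purely as a sketch, for two
pre-aligned sequences of equal length. The candidate sequence below is hypothetical
and carries three arbitrary substitutions relative to the first 30 coding nucleotides
of SEQ ID No. 1; the function names are likewise illustrative assumptions. A
practical comparison would first require a sequence alignment, which is not shown.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two pre-aligned,
    equal-length nucleotide sequences (no gaps are handled)."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def longest_identical_stretch(seq_a, seq_b):
    """Length of the longest run of consecutive identical positions."""
    best = run = 0
    for a, b in zip(seq_a, seq_b):
        run = run + 1 if a == b else 0
        best = max(best, run)
    return best


# First 30 coding nucleotides of SEQ ID No. 1 (codons 1 through 10).
HUMAN_FRAGMENT = "ATGGATTCCAAACAGCAGTGTGTAAAGCTA"
# Hypothetical candidate from another species (three substitutions, invented
# for this example only).
CANDIDATE_FRAGMENT = "ATGGACTCCAAACAGCAATGTGTAAAGCTG"

identity = percent_identity(HUMAN_FRAGMENT, CANDIDATE_FRAGMENT)
stretch = longest_identical_stretch(HUMAN_FRAGMENT, CANDIDATE_FRAGMENT)

# Thresholds taken from the text: at least 70% homology and a stretch of
# perfect identity at least 10 nucleotides long.
meets_minimum_criteria = identity >= 70.0 and stretch >= 10
```

In this invented example the candidate would satisfy the minimum preferred criteria;
real candidates would of course be assessed over the full coding region.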



SUBSTITUTE SHEET (RULE 26)

WO 97/11162 PCT/CA96/00605

In one embodiment, the invention provides an isolated nucleotide sequence
encoding type 5 17β-hydroxysteroid dehydrogenase, said sequence being sufficiently
homologous to SEQ ID No. 1 or a complement thereof, to hybridize under stringent
conditions to the coding region of SEQ ID No. 1 or a complement thereof and said
sequence encoding an enzyme which catalyzes the conversion of progesterone to 20α-
hydroxyprogesterone and the conversion of 4-androstenedione to testosterone.

In a further embodiment, the invention provides an isolated nucleotide
sequence comprising at least ten consecutive nucleotides identical to 10 consecutive
nucleotides in the coding region of SEQ ID No. 1, or the complement thereof.

In an additional embodiment, the invention provides an oligonucleotide
sequence selected from the group consisting of SEQ ID Nos. 30 through 59.

In another embodiment, the invention provides a recombinant expression
vector comprising a promoter sequence and an oligonucleotide sequence selected from
the group of SEQ ID Nos. 30 to 59.

In a further embodiment, the invention provides a method of blocking synthesis
of type 5 17β-HSD, comprising the step of introducing an oligonucleotide selected
from the group consisting of SEQ ID Nos. 30 to 59 into cells.

In an additional embodiment, the invention provides an isolated chromosomal
DNA fragment which upon transcription and translation encodes type 5 17β-
hydroxysteroid dehydrogenase and wherein said fragment contains nine exons and
wherein said fragment includes introns which are 16 kilobase pairs in length.

In another embodiment, the invention provides an isolated DNA sequence
encoding type 5 17β-hydroxysteroid dehydrogenase, said sequence being sufficiently
homologous to SEQ ID No. 3 or a complement thereof, to hybridize under stringent
conditions to SEQ ID No. 3, or its complement.

In a further embodiment, the invention provides a method for producing type 5
17β-hydroxysteroid dehydrogenase, comprising the steps of preparing a recombinant
host transformed or transfected with the vector of claim 3 and culturing said host
under conditions which are conducive to the production of type 5 17β-hydroxysteroid
dehydrogenase by said host.

In an additional embodiment, the invention provides a method for determining
the inhibitory effect of a test compound on the enzymatic activity of type 5 17β-
hydroxysteroid dehydrogenase, comprising the steps of providing type 5 17β-
hydroxysteroid dehydrogenase; contacting said type 5 17β-hydroxysteroid
dehydrogenase with said test compound; and thereafter determining the enzymatic
activity of said type 5 17β-hydroxysteroid dehydrogenase in the presence of said test
compound.

In an additional embodiment, the invention provides a method of interfering
with the expression of type 5 17β-hydroxysteroid dehydrogenase, comprising the step
of administering nucleic acids substantially identical to at least 15 consecutive
nucleotides of SEQ ID No. 1 or a complement thereof.

In a further embodiment, there is provided a method of interfering with the
synthesis of type 5 17β-hydroxysteroid dehydrogenase, comprising the step of
administering antisense RNA complementary to mRNA encoded by at least 15
consecutive nucleotides of SEQ ID No. 1 or a complement thereof.

In an additional embodiment, the invention provides a method of interfering
with the expression of type 5 17β-hydroxysteroid dehydrogenase, comprising the step
of administering nucleic acids substantially identical to at least 15 consecutive
nucleotides of SEQ ID No. 3 or a complement thereof.

In another embodiment, the invention provides a method of interfering with the
synthesis of type 5 17β-hydroxysteroid dehydrogenase, comprising the step of
administering antisense RNA complementary to mRNA encoded by at least 15
consecutive nucleotides of SEQ ID No. 3 or a complement thereof.

In a further embodiment, there is provided a method for determining the
inhibitory effect of antisense nucleic acids on the enzymatic activity of type 5 17β-
hydroxysteroid dehydrogenase, comprising the steps of providing a host system
capable of expressing type 5 17β-hydroxysteroid dehydrogenase; introducing said
antisense nucleic acids into said host system; and thereafter determining the enzymatic
activity of said type 5 17β-hydroxysteroid dehydrogenase.

Other features and advantages of the present invention will become apparent
from the following description of the invention which refers to the accompanying
drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

Figures 1A and 1B are graphs showing the enzymatic activities of Type 5 17β-
HSD on various substrates. The enzyme was expressed in embryonal kidney (293)
cells (ATCC CRL 1573) which were transfected with a vector, prepared in accordance
with the invention, and containing the gene encoding human type 5 17β-HSD. Figure
1A shows the substrate specificity of type 5 17β-HSD. The concentration of each
substrate was 0.1 μM. Figure 1B shows the time course of the 20α-HSD and
17β-HSD activities of cells transfected with vectors containing human type 5 17β-
HSD. The substrates, progesterone (P) and Δ4-dione, were added at a concentration
of 0.1 mM;

Figure 2 is a map of a pCMV vector which is exemplary of one that can be
used to transfect host cells in accordance with the invention;

Figure 3 is the cDNA sequence (SEQ ID No. 1) and the deduced amino acid
sequence (SEQ ID No. 2) of human type 5 17β-HSD. The nucleotide sequence is
numbered in the 5' to 3' direction with the adenosine of the initiation codon (ATG)
designated as +11. The translation stop codon is indicated by asterisks. The
potential post-translational modification sites are underlined, wherein TSK = tyrosine
sulfokinase; CK2 = casein kinase II; PKC = protein kinase C; NG = N-glycosylation;
and NM = N-myristoylation;

Figure 4 is a comparison of the deduced amino acid sequence of human type 5
17β-HSD to the amino acid sequences of rabbit (rb), rat (r), and bovine (b) 20α-HSD
as well as human (h) and rat (r) 3α-HSD, bovine prostaglandin F synthase (b pgfs) and
frog ρ-crystallin (f ρ-crys). The amino acid sequences are indicated using the
conventional single letter code and are numbered on the right. The dashes (-) and
dots (.) indicate identical and missing amino acid residues, respectively;

Figure 5 is a map of the chromosomal DNA of a gene which encodes type 5
17β-HSD; and

Figures 6A and 6B are the nucleotide sequence of the chromosomal DNA of a
gene which encodes type 5 17β-HSD.

DETAILED DESCRIPTION OF THE INVENTION



A gene encoding the enzyme, type 5 17β-HSD, has been isolated and encodes
a protein having 323 amino acids with a calculated molecular weight of 36,844
daltons. As shown in Figure 3, the coding portion of this gene includes nucleotides
+11 through 982, including the stop codon (and encodes amino acids +1 through
323), numbered in the 5' to 3' direction.

The chromosomal DNA fragment of the gene for type 5 17β-HSD has also
been characterized. A map of the gene is provided in Figure 5. In particular, it was
found, using primer extension analysis, that the gene includes 16 kilobase pairs (kb)
and contains nine short exons. A portion of the 5' flanking region, as set forth in
SEQ ID No. 3, of the genomic DNA includes 730 base pairs (bp). Exon 1 (SEQ ID
No. 4) contains 37 nucleotides in the 5'-noncoding region and the nucleotides for the
first 28 amino acids. The second intron region includes the nucleotides set forth in
SEQ ID Nos. 5 and 6, which are 252 and 410 bp, respectively. These are joined by a
1.2 kb region which is not important and therefore, its sequence has been omitted.
Exon 2 (SEQ ID No. 7) contains nucleotides for the following 56 amino acids of
human type 5 17β-HSD. The following intron region includes SEQ ID Nos. 8 and 9,
700 and 73 bp, respectively, which are joined by a 0.1 kb region for which the
sequence has not been provided. Exon 3 (SEQ ID No. 10) includes the next 117
nucleotides which specify the following 39 amino acids. The fourth intron region is
represented by SEQ ID Nos. 11 and 12, 152 and 208 nucleotides in length,
respectively, with a 0.9 kb region in between which has not been provided. Exon 4
(SEQ ID No. 13) includes the next 78 bp which specify the following 26 amino acids
of the enzyme. Intron region five contains SEQ ID Nos. 14 and 15, with 98 and 249
nucleotides, respectively, with a 0.1 kb region in the middle which has not been
provided. The fifth exon (SEQ ID No. 16) contains nucleotides for the following 41
amino acids of human type 5 17β-HSD. The sixth intron region, set forth in SEQ ID
Nos. 17 and 18 with 138 and 189 bp, respectively, also includes a 2.8 kb region
which has not been provided. Exon 6 (SEQ ID No. 19) contains nucleotides for the
following 36 amino acids of type 5 17β-HSD, as well as two nucleotides of the codon
227 (Trp). The next intron region includes a 136 bp portion (SEQ ID No. 20) and a
66 bp portion (SEQ ID No. 21) which are joined by a 0.1 kb region which is not set
forth. Exon 7 (SEQ ID No. 22) contains nucleotides for the third nucleotide of codon
227 (Trp) and nucleotides for the following 55 codons. The following intron region
includes a 136 nucleotide region (SEQ ID No. 23), a 2.5 kb region which is not
provided and a 286 bp region (SEQ ID No. 24). Exon 8 (SEQ ID No. 25) includes 83
nucleotides which code for the following 27 amino acids and 2 nucleotides of codon
310. The ninth intron region contains 713 nucleotides (SEQ ID No. 26) followed by a
1 kb region which has not been provided followed by a 415 nucleotide region (SEQ
ID No. 27). Exon 9 (SEQ ID No. 28) contains the third nucleotide of codon 310, 42
nucleotides for the last 13 amino acids and a stop codon and approximately 200
nucleotides in the 3'-untranslated region. A polymorphic (GT)n repeat region that can
be used to perform genetic linkage mapping of the type 5 17β-HSD can be found 255
nucleotides downstream from the TAA stop codon. SEQ ID No. 29 sets forth 109 bp
of additional genomic sequence. The nucleotide sequence of the gene fragment, as
described above, is provided in Figures 6A and 6B.
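
The exon description above can be checked for internal consistency with a short
sketch. The following Python, given only as an illustration with hypothetical
variable names, tabulates the coding content of each exon as stated in the text
(whole codons plus any nucleotides of a codon split across an intron) and compares
the total with the 323 amino acids and the +11 through 982 coding coordinates.

```python
# Each entry: (exon number,
#              leading nucleotides that complete a codon split by an intron,
#              whole codons encoded within the exon,
#              trailing nucleotides that begin a split codon).
# Values are transcribed from the description; the stop codon and the
# noncoding nucleotides of exons 1 and 9 are excluded.
EXON_CODING = [
    (1, 0, 28, 0),
    (2, 0, 56, 0),
    (3, 0, 39, 0),   # 117 nucleotides
    (4, 0, 26, 0),   # 78 nucleotides
    (5, 0, 41, 0),
    (6, 0, 36, 2),   # plus two nucleotides of codon 227
    (7, 1, 55, 0),   # third nucleotide of codon 227, then 55 codons
    (8, 0, 27, 2),   # 83 nucleotides in total
    (9, 1, 13, 0),   # third nucleotide of codon 310, then the last 13 codons
]
STOP_CODON_NT = 3

exon_nt = {
    exon: lead + 3 * codons + trail
    for exon, lead, codons, trail in EXON_CODING
}
coding_nt = sum(exon_nt.values())          # sense codons only
amino_acids = coding_nt // 3
coding_nt_with_stop = coding_nt + STOP_CODON_NT
```

With these figures the per-exon lengths quoted in the text (117, 78 and 83
nucleotides for exons 3, 4 and 8) and the overall length of the coding region agree
with one another.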

The type 5 17β-HSD enzyme can be produced by incorporating the nucleotide
sequence for the coding portion of the gene into a vector which is then transformed or
transfected into a host system which is capable of expressing the enzyme. The DNA
can be maintained transiently in the host or can be stably integrated into the genome of
the host cell. In addition, the chromosomal DNA can be incorporated into a vector
and transfected into a host system for cloning.

In particular, for the cloning and expression of type 5 17β-HSD, any common
expression vectors, such as plasmids, can be used. These vectors can be prokaryotic
expression vectors including those derived from bacteriophage λ such as λgt11 and
λEMBL3, E. coli plasmids such as pBR322 and Bluescript (Stratagene); or eukaryotic
vectors, such as those in the pCMV family. Vectors incorporating an isolated human
cDNA shown in Sequence ID No. 1 (ATCC Deposit No. ) and the chromosomal
DNA as shown in Sequence ID Nos. 3 through 29 (ATCC Deposit No. ) for type 5
17β-HSD have been placed on deposit at the American Type Culture Collection
(ATCC, Rockville, MD), in accordance with the terms of the Budapest Treaty, and
will be made available to the public upon issuance of a patent based on the present
patent application.

These vectors generally include appropriate replication and control sequences
which are compatible with the host system into which the vectors are transfected. A
promoter sequence is generally included. For prokaryotes, some representative
promoters include β-lactamase, lactose, and tryptophan. In mammalian cells,
commonly used promoters include, but are not limited to, adenovirus,
cytomegalovirus (CMV) and simian virus 40 (SV40). The vector can also optionally
include, as appropriate, an origin of replication, ribosome binding sites, RNA splice
sites, polyadenylation sites, transcriptional termination sequences and/or a selectable
marker. It is well understood that there are a variety of vector systems with various
characteristics which can be used in the practice of the invention. A map of the
pCMV vector, which is an example of a vector which can be used in the practice of
the invention, is provided in Figure 2.

Host systems which are commonly known for expressing an enzyme,
and which may be transfected with an appropriate vector which includes a gene for
Type 5 17β-HSD, can be used in the practice of the invention. These host systems
include prokaryotic hosts, such as E. coli, bacilli, such as Bacillus subtilis, and other
enterobacteria, such as Salmonella, Serratia, and Pseudomonas species. Eukaryotic
microbes, including yeast cultures, can also be used. The most common of these is
Saccharomyces cerevisiae, although other species are commercially available and can
be used. Furthermore, cell cultures can be grown which are derived from mammalian
cells. Some examples of suitable host cell lines include embryonal kidney (293), SW-
13, Chinese hamster ovary (CHO), HeLa, myeloma, Jurkat, COS-1, BHK, WI38 and
Madin-Darby canine kidney (MDCK). In the practice of the invention, the 293 cells
are preferred.



Type 5 17β-HSD, whether recombinantly produced as described herein,
purified from nature, or otherwise produced, can be used in assays to identify
compounds which inhibit or alter the activity of the enzyme. In particular, since type
5 17β-HSD is shown to catalyze the conversion of progesterone to 20α-OH-P and the
conversion of Δ4-dione to testosterone, this enzyme can be used to identify compounds
which interfere with the production of these sex steroids. It is preferred that the
enzyme be obtained directly from the recombinant host, wherein following expression,
a crude homogenate is prepared which includes the enzyme. A substrate of the
enzyme, such as progesterone or Δ4-dione, and a compound to be tested are then mixed
with the homogenate. The activity of the enzyme with and without the test compound
is compared. Numerous methods are known which can be used to indicate the effect
of the test compound on the conversion of the substrate, for easy detection of the
relative amounts of substrate and product over time. For example, it is possible to
label the substrate so that the label also stays on any product that is formed.
Radioactive labels, such as 14C or 3H, which can be quantitatively analyzed are
particularly useful.

It is preferred that the mixture of the enzyme, test compound and substrate be
allowed to incubate for a predetermined amount of time. In addition, it is preferred
that the product is separated from the substrate for easier analysis. A number of
separation techniques are known, for example, thin layer chromatography (TLC), high
pressure liquid chromatography (HPLC), spectrophotometry, gas chromatography,
mass spectrometry and nuclear magnetic resonance (NMR). However, any
known method which can differentiate between a substrate and a product can be used.
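
As a purely illustrative sketch of how data from such an assay might be reduced, the
following Python computes the percent conversion of labelled substrate to product
and the percent inhibition produced by a test compound. The radioactivity counts are
hypothetical values invented for the example, and the function names are assumptions
rather than part of any protocol described herein.

```python
def percent_conversion(substrate_counts, product_counts):
    """Percent of recovered label found in the product fraction after
    separation of product from substrate (e.g. by TLC or HPLC)."""
    total = substrate_counts + product_counts
    if total <= 0:
        raise ValueError("no label recovered")
    return 100.0 * product_counts / total


def percent_inhibition(control_conversion, treated_conversion):
    """Reduction in conversion caused by the test compound, relative to the
    control incubated without the compound."""
    if control_conversion <= 0:
        raise ValueError("control shows no enzymatic activity")
    return 100.0 * (1.0 - treated_conversion / control_conversion)


# Hypothetical counts (e.g. dpm) for substrate and product fractions.
control = percent_conversion(substrate_counts=6000, product_counts=4000)
treated = percent_conversion(substrate_counts=9000, product_counts=1000)
inhibition = percent_inhibition(control, treated)
```

Expressing activity as a fraction of recovered label, rather than as raw product
counts, makes the comparison less sensitive to differences in recovery between
samples; this is a design choice of the sketch, not a requirement of the assay.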

It is also contemplated that the gene for type 5 17β-HSD or a portion thereof
can be used to produce antisense nucleic acid sequences for inhibiting expression of
Type 5 17β-HSD in vivo. Thus activity of the enzyme and levels of its products (e.g.
testosterone) may be reduced where desirable. In general, antisense nucleic acid
sequences can interfere with transcription, splicing or translation processes. Antisense
sequences can prevent transcription by forming a triple helix or hybridizing to an
opened loop which is created by RNA polymerase or hybridizing to nascent RNA.
On the other hand, splicing can advantageously be interfered with if the antisense
sequences bind at the intersection of an exon and an intron. Finally, translation can be
affected by blocking the binding of initiation factors or by preventing the assembly of
ribosomal subunits at the start codon or by blocking the ribosome from the coding
portion of the mRNA, preferably by using RNA that is antisense to the message. For
further general information, see Hélène et al., Biochimica et Biophysica Acta,
1049:99-125 (1990), which is herein incorporated by reference in its entirety.

An antisense nucleic acid sequence is an RNA or single stranded DNA
sequence which is complementary to the target portion of the target gene. These
antisense sequences are introduced into cells where the complementary strand base
pairs with the target portion of the target gene, thereby blocking the transcription,
splicing or translation of the gene and eliminating or reducing the production of type 5
17β-HSD. The length of the antisense nucleic acid sequence need be no more than is
sufficient to interfere with the transcription, splicing or translation of functional type 5
17β-HSD. Antisense strands can range in size from 10 nucleotides to the complete
gene; however, about 10 to 50 nucleotides are preferred, and 15 to 25 nucleotides are
most preferred.

Although it is contemplated that any portion of the gene could be used to
produce antisense sequences, it is preferred that the antisense is directed to the coding
portion of the gene or to the sequence around the translation initiation site of the
mRNA or to a portion of the promoter. Some examples of specific antisense
oligonucleotide sequences in the coding region which can be used to block type 5
17β-HSD synthesis include: TTTAGCTTTACACACTGCTGTT (SEQ ID No. 30);
TCCAAAGCTTTACTTCTCGG (SEQ ID No. 31); GATGAAAAGTGGACCA
(SEQ ID No. 32); ATCTGTTGGTGAAAGTTC (SEQ ID No. 33);
TCCAGCTGCCTGCGGT (SEQ ID No. 34); CTTGTACTTGAGTCCTG (SEQ ID
No. 35); CTCCGGTTGAAATACGGA (SEQ ID No. 36);
CATCGTTTGTCTCGTTGAGA (SEQ ID No. 37);
TCACTGTTAAAATAGTGGAGAT (SEQ ID No. 38); ATCTGAATATGGATAAT
(SEQ ID No. 39). Examples of antisense oligonucleotide sequences which can block
the splicing of the type 5 17β-HSD premessage are as follows:
TTCTCGGAACCTGGAGGAGC (SEQ ID No. 40);
GACACAGTACCTTTGAAGTG (SEQ ID No. 41);
TGGACCAAAGCTGCAGAGGT (SEQ ID No. 42);
CCTCACCTGGCTGAAATAGA (SEQ ID No. 43);
AAGCACTCACCTCCCAGGTG (SEQ ID No. 44);
GACATTCTACCTGCAGTTGA (SEQ ID No. 45); CTCAAAAACCTATCAGAAA
(SEQ ID No. 46); GGAAACTTACCTATCACTGT (SEQ ID No. 47);
GCTAGCAAAACTGAAAAGAG (SEQ ID No. 48). Examples of antisense
oligonucleotide sequences which inhibit the promoter activity of type 5 17β-HSD
include: GAGAAATATTCATTCTG (SEQ ID No. 49);
CGAGTCCTGATAAAGCTG (SEQ ID No. 50); GATGAGGGTGCAAATAA (SEQ
ID No. 51); GGAGTGTTAATTAATAACAGTTT (SEQ ID No. 52);
CAGAGATTACAAAAACAAT (SEQ ID No. 53);
T<jCCrrrriACATTTTCAATCA (SEQ ID No. 54); ACACATAATTTAAAGGA
(SEQ ID No. 55); TTAAATTATTCAAAAGG (SEQ ID No. 56);
AAGAGAAATATTCATTTCTG (SEQ ID No. 57);
CCCCTCCCCCCACCCCTGCA (SEQ ID No. 58); CTGCCGTGATAATGCCCC
(SEQ ID No. 59).
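
An antisense oligonucleotide directed to the coding region is the reverse complement
of its sense-strand target. As an illustrative sketch only (the function and variable
names are hypothetical), the following Python recovers the sense-strand target of
SEQ ID No. 30 and locates it within the first 46 nucleotides of SEQ ID No. 1.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# First 46 nucleotides of SEQ ID No. 1 (sense strand, 5' to 3').
SENSE_FRAGMENT = "GTGACAGGGAATGGATTCCAAACAGCAGTGTGTAAAGCTAAATGAT"
# Antisense oligonucleotide as listed in the text.
SEQ_ID_30 = "TTTAGCTTTACACACTGCTGTT"

target = reverse_complement(SEQ_ID_30)
offset = SENSE_FRAGMENT.find(target)        # 0-based index, -1 if absent
first_nt = offset + 1                       # 1-based numbering as in Figure 3
last_nt = offset + len(target)
```

The same check can be applied to any of the coding-region oligonucleotides listed
above, provided the corresponding stretch of SEQ ID No. 1 is supplied as the sense
sequence.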

As is well understood in the art, the oligonucleotide sequences can be modified
in various manners in order to increase the effectiveness of the treatment with
oligonucleotides. In particular, the sequences can be modified to include additional
RNA at the 3' end of the RNA which can form a hairpin-loop structure and thereby
prevent degradation by nucleases. In addition, the chemical linkages in the backbone
of the oligonucleotides can be modified to also prevent cleavage by nucleases.

There are numerous methods which are known in the art for introducing the
antisense strands into cells. One strategy is to incorporate the gene which encodes
type 5 17β-HSD in the opposite orientation in a vector so that the RNA which is
transcribed from the plasmid is complementary to the mRNA transcribed from the
cellular gene. A strong promoter, such as pCMV, is generally included in the vector,
upstream of the gene sequence, so that a large amount of the antisense RNA is
produced and is available for binding sense mRNA. The vectors are then transfected
into cells which are then administered. It is also possible to produce single-stranded
DNA oligonucleotides or antisense RNA and incorporate these into cells or liposomes
which are then administered. The use of liposomes, such as those described in
WO95/03788, which is herein incorporated by reference, is preferred. However,
other methods which are well understood in the art can also be used to introduce the
antisense strands into cells and to administer them to patients in need of such
treatment.
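
The complementarity on which the antisense approach relies can be illustrated with a minimal sketch. It is not part of the described method. The function names are hypothetical, and the 30-nucleotide promoter fragment is an excerpt copied from the SEQ ID NO: 3 listing for illustration only. The sketch computes the reverse complement of an antisense oligonucleotide and locates the sense region to which it would be expected to pair.

```python
# Complement table for the four DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_target(antisense: str, sense: str) -> int:
    """Return the 0-based index in `sense` where `antisense` pairs, or -1."""
    return sense.upper().find(reverse_complement(antisense))


# SEQ ID No. 57 as given in the text.
oligo = "AAGAGAAATATTCATTTCTG"
# Illustrative excerpt (assumption): 30 nucleotides from the SEQ ID NO: 3 listing.
promoter_fragment = "CTTCTTCAGAAATGAATATTTCTCTTACAA"

print(reverse_complement(oligo))
print(antisense_target(oligo, promoter_fragment))
```

A return value of -1 from `antisense_target` would indicate that the oligonucleotide has no perfectly complementary site in the fragment supplied.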

The following is an example of the expression of human type 5 17β-HSD.
This example is intended to be illustrative of the invention and it is well understood by
those of skill in the art that modifications, alterations and different techniques can be
used within the scope of the invention.

Expression of
20α, 17β-HSD (Type 5 17β-HSD)

Construction of the expression vector and nucleotide sequence determination

The phage DNA was digested with EcoRI restriction enzyme and the resulting
cDNA fragments were inserted in the EcoRI site downstream of the cytomegalovirus
(CMV) promoter of the pCMV vector as shown in Figure 2. The recombinant pCMV
plasmids were amplified in Escherichia coli DH5α competent cells, and were isolated
using the alkaline lysis procedure as described by Maniatis in Molecular Cloning: A
Laboratory Manual (Cold Spring Harbor Press 1982). The sequencing of
double-stranded plasmid DNA was performed according to the dideoxy chain termination
method described by Sanger, F. et al., Proc. Natl. Acad. Sci., 74:5463-5467 (1977)
using a T7 DNA polymerase sequencing kit (Pharmacia LKB Biotechnology). In
order to avoid errors, all sequences were determined by sequencing both strands of the
DNA. The oligonucleotide primers were synthesized using a 394 DNA/RNA
synthesizer (Applied Biosystems).

As shown in Figure 2, the pCMV vector contains 582 nucleotides of the
pCMV promoter, followed by 74 nucleotides of unknown origin which include the
EcoRI and HindIII sites, followed by 432 base pairs (bp) of a small t intron (fragment
4713 - 4570) and a polyadenylation signal (fragment 2825 - 2536) of SV40, followed
by 156 nucleotides of unknown origin, followed by 1989 bp of the PvuII (628) to
AatII (2617) fragment from the pUC 19 vector (New England Biolabs) which contains
an E. coli origin of replication and an ampicillin resistance gene for propagation in E.
coli.
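
As a quick arithmetic check of the vector description, the stated segment lengths can be summed. The sketch below is illustrative only: the segment labels and the helper name are assumptions, and the total it prints is simply the sum of the lengths given in the text, not a separately reported plasmid size.

```python
# Segment lengths (in nucleotides) as stated in the vector description.
segments = [
    ("CMV promoter", 582),
    ("linker containing the EcoRI and HindIII sites", 74),
    ("SV40 small t intron and polyadenylation signal", 432),
    ("spacer of unknown origin", 156),
    ("pUC 19 PvuII-AatII fragment", 1989),
]


def backbone_length(parts):
    """Sum the lengths of all listed segments."""
    return sum(length for _, length in parts)


# The pUC 19 fragment length should equal the difference between the
# AatII (2617) and PvuII (628) coordinates quoted in the text.
puc19_span = 2617 - 628

print(puc19_span)
print(backbone_length(segments))
```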

Transient expression in transformed embryonal kidney (293) cells

The vectors were transfected using the calcium phosphate procedure described
by Kingston, R.E., In: Current Protocols in Molecular Biology, Ausubel et al. eds.,
pp. 9.1.1 - 9.1.9, John Wiley & Sons, N.Y. (1987), using 1 to 10 μg of
recombinant plasmid DNA per 10^ cells. The total amount of DNA was kept at 10 μg of
plasmid DNA per 10^ cells by completing with pCMV plasmid without insert. The
cells were initially plated at 10^ cells/cm² in Falcon® culture flasks and grown in
Dulbecco's modified Eagle's medium containing 10% (vol/vol) fetal bovine serum
(HyClone, Logan, UT) and supplemented with 2 mM L-glutamine, 1 mM sodium
pyruvate, 100 IU penicillin/ml, and 100 μg streptomycin sulfate/ml, under a
humidified atmosphere of air/CO2 (95%/5%) at 37°C.
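
The constant-total-DNA rule described above (recombinant plasmid topped up with insert-free pCMV to 10 μg) amounts to a simple subtraction. The helper below is a hypothetical sketch of that bookkeeping; the function name and the example doses are assumptions chosen for illustration.

```python
TOTAL_DNA_UG = 10.0  # total plasmid DNA per transfection, as stated in the text


def empty_vector_amount(recombinant_ug: float, total_ug: float = TOTAL_DNA_UG) -> float:
    """Micrograms of insert-free pCMV needed to reach the fixed total."""
    if not 0 <= recombinant_ug <= total_ug:
        raise ValueError("recombinant DNA must be between 0 and the total amount")
    return total_ug - recombinant_ug


# Example doses within the 1 to 10 microgram range given in the text.
for dose in (1.0, 2.5, 5.0, 10.0):
    print(dose, empty_vector_amount(dose))
```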

Assay of enzymatic activity 

The determination of enzymatic activity was performed as described by Luu-The
et al., Biochemistry, 13:8861-8865 (1991), which is herein incorporated by
reference. See also Lachance et al., J. Biol. Chem., 265:20469-20475 (1990).
Briefly, 0.1 of the indicated 14C-labeled substrate (Dupont Inc. (Canada)), namely,
dehydroepiandrosterone (DHEA), 4-androstene-3,17-dione (Δ4-dione), testosterone
(T), estrone (E1), estradiol (E2), dihydrotestosterone (DHT), and progesterone
(PROG), was added to freshly changed culture medium in a 6-well culture plate.
After incubation for 1 hour, the steroids were extracted twice with 2 ml of ether. The
organic phases were pooled and evaporated to dryness. The steroids were solubilized in
50 μl of dichloromethane, applied to a Silica gel 60 thin layer chromatography (TLC)
plate (Merck, Darmstadt, Germany) and then separated by migration in the
toluene-acetone (4:1) solvent system (Luu-The, V. et al., J. Invest. Dermatol., 102:221-226
(1994), which is herein incorporated by reference). The substrates and metabolites
were identified by comparison with reference steroids, revealed by autoradiography
and quantitated using the PhosphorImager System (Molecular Dynamics, Sunnyvale,
CA).

Cloning of the type 5 17β-HSD genomic DNA clone

The hybridization and sequencing methods were as described above and as
previously described (Luu-The et al., Mol. Endocrinol., 4:268-275 (1990); Luu-The
et al., DNA and Cell Biol., 14:511-518 (1995); Lachance et al., J. Biol. Chem.,
265:20469-20475 (1990); Lachance et al., DNA and Cell Biol., 10:701-711 (1991);
Bernier et al., J. Biol. Chem., 269:28200-28205 (1994), which are herein
incorporated by reference).

About 20 recombinant clones which gave the strongest hybridization signal
were selected for second and third screening in order to isolate a single phage plaque.
The two longest clones that hybridized with specific oligonucleotide probes located
at the 5' and 3' regions of type 5 17β-HSD, respectively, were selected for mapping,
subcloning and sequencing. As shown in Figures 5 and 6, the gene is included in
approximately 16 kilobase pairs of introns and contains 9 short exons. A primer
extension analysis using oligoprimer CAT-CAT-TTA-GCT-TTA-CAT-ACT-GCT-G,
located at positions 13 to 21, indicates that the start site is situated 37 nucleotides
upstream from the ATG initiating codon.
The sites and signatures in the primary protein sequence were detected using
PC/Gene software (IntelliGenetics Inc., Mountain View, CA). This analysis revealed
a potential N-glycosylation site at Asn-198; five protein kinase C sites at Ser-73, Thr-82,
Ser-102, Ser-121, and Ser-221; five casein kinase II phosphorylation sites at Ser-129,
Thr-146, Ser-221, Ser-271, and Thr-289; two N-myristoylation sites at Gly-158
and Gly-298; a tyrosine sulfation site at Tyr-55; an aldo/keto reductase family
signature 1 (25) at amino acids 158 to 168 and an aldo/keto reductase family putative
active site signature at amino acids 262 to 280.
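
By way of illustration, a site search of this kind can be sketched with a regular expression. The sketch assumes the PROSITE-style N-glycosylation consensus N-{P}-[ST]-{P}; the rules actually applied by the PC/Gene software are not given in the text. The function name is hypothetical, and the fragment is residues 193 to 208 of SEQ ID NO: 2 transcribed into one-letter code.

```python
import re

# PROSITE-style N-glycosylation consensus N-{P}-[ST]-{P} (an assumption);
# the lookahead lets overlapping candidate sites be reported.
N_GLYC = re.compile(r"(?=N[^P][ST][^P])")


def n_glycosylation_sites(fragment: str, first_residue: int = 1) -> list[int]:
    """Return residue numbers of Asn residues that start the motif."""
    return [m.start() + first_residue for m in N_GLYC.finditer(fragment.upper())]


# Residues 193-208 of SEQ ID NO: 2 in one-letter code
# (Cys His Pro Tyr Phe Asn Arg Ser Lys Leu Leu Asp Phe Cys Lys Ser).
fragment = "CHPYFNRSKLLDFCKS"

print(n_glycosylation_sites(fragment, first_residue=193))
```

With the fragment numbered from residue 193, the single match falls on Asn-198, in agreement with the site reported above.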

As described above, the enzymatic activity of the type 5 17β-HSD was
evaluated by transfecting 293 cells with vectors which included the gene encoding
human type 5 17β-HSD. The ability of the enzyme to catalyze the transformation of
progesterone (P) to 20α-hydroxyprogesterone (20α-OH-P), 4-androstenedione (Δ4-dione)
to testosterone (T), 5α-androstane-3,17-dione (A-dione) to dihydrotestosterone
(DHT), dehydroepiandrosterone (DHEA) to 5-androstene-3β,17β-diol, and estrone
(E1) to estradiol (E2) was analyzed. As shown in Figure 1A, the enzyme possesses
high reductive 20α-HSD activity, wherein progesterone (P) is transformed to the
inactive 20α-OH-P, and 17β-HSD activity, wherein Δ4-dione is converted to
testosterone (T). However, 3α-HSD activity, which is responsible for the
transformation of DHT to 5α-androstane-3α,17β-diol, is negligible. The ability of this
enzyme to transform E1 and E2 was also negligible (Figure 1A). Figure 1B shows
that the 20α-HSD and 17β-HSD activities increased over time.

The isolated amino acid sequence of human type 5 17β-HSD was also
compared with rabbit 20α-HSD (rb), rat 20α-HSD (r), human 3α-HSD (h), rat 3α-HSD
(r), bovine prostaglandin F synthase (b pgfs), frog ρ-crystallin (f ρ-crys) and
human type 1 and type 2 17β-HSDs (h) as shown in Figure 4. These sequences show
76.2%, 70.7%, 84.0%, 68.7%, 78.3%, 59.7%, 15.2% and 15.0% identity with type
5 17β-HSD, respectively.
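
A percent-identity figure of this kind can be computed from a pairwise alignment as sketched below. The text does not state which alignment program or denominator was used, so the convention here (matches divided by the number of columns in which neither sequence has a gap) is an assumption, as are the function name and the toy aligned strings, which are not taken from Figure 4.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned, gap-free columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / compared


# Toy aligned strings for illustration only (not from Figure 4).
print(round(percent_identity("MDSKQQCV-KLN", "MDPKYQRVEKLN"), 1))
```

Other conventions, such as dividing by the length of the shorter sequence or by the full alignment length, give somewhat different values for the same alignment.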

Although the present invention has been described in relation to particular
embodiments thereof, many other variations and modifications and other uses will be
apparent to those skilled in the art.
(1) GENERAL INFORMATION:

(i) APPLICANT: LUU-THE, Van
LABRIE, Fernand

(iii) NUMBER OF SEQUENCES: 59
(iv) CORRESPONDENCE ADDRESS: 

(c! «??rNei';Lr'""* '^"^^"^ 

(D) STATE: NY

(E) COUNTRY: US

(F) ZIP: 10036-8403

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Meilman, Edward

(B) REGISTRATION NUMBER: 24,735

(C) REFERENCE/DOCKET NUMBER: P/1259-313



(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (212) 382-0700 

(B) TELEFAX: (212) 382-0888 

(C) TELEX: 236925 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1206 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 11..982

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GTGACAGGGA ATG GAT TCC AAA CAG GAG TGT GTA AAG CTA AAT GAT CGC 49 
Met Asp Ser Lys GXn Gin Cys Val Lvs Leu Asn Asp Gly 
i 5 * 10 



CAC TTC ATG OCT GTA TTG GGA TTT OGC ACC.TAT GCA OCT CCA GAG GTT 97 

in ^''^ <5^y AI* Pro Pro Glu Val 

*u 15 20 25 

^ AGT AAA GCT TTG GAG GTC ACC AAA TTA GCA ATA GAA GCT GGG 145 
Pro Arg Ser Lys Ala Leu Glu Val Thr Lys Leu Ala He Glu Ala Gly 
15 ^° 35 40 45 

TTC CGC CAT ATA GAT TCT GCT CAT TTA TAC AAT AAT GAG GAG CAG GTT 193 
Phe Arg Hxs He Asp Ser Ala His Leu Tyr Asn Asn Glu Glu Gin Val 
50 55 go 

20 GGA CTG GCC ATC CGA AGC AAG ATT GCA GAT GGC AGT GTG AAG AGA GAA 241 
Gly Leu Ala lie Arg Ser Lys He Ala Asp Gly Ser Val Lys Arg Glu 
65 70 75 

tT^ TAC JCT TCA AAG CTT TGG TCC ACT TTT cat CGA CCA gag 2B9 

25 Asp He Phe Tyr Thr Ser Lys Leu Trp Ser Thr Phe His Arg Pro Glu 
»0 85 90 

SI? S?^ ^ ^ GCT »A TO 337 

Leu Val Arg Pro Ala Leu Glu Asn Ser Leu Lys Lys Ala Gin Leu Asp 
M 95 100 205 

Itl SI? I" f" ?F F*'* ^ ^'^^ "^CT CTA AAG CCA GGT 385 

Tyr Val Asp Leu Tyr Leu He His Ser Pro Met Ser Leu Lys Pro Gly 



433 



461 



35 "° "5 120 125 

5?^ 5^"^ GAA AAT GGA AAA GTA ATA TTT GAC ATA 
Glu Glu Leu Ser Pro Thr Asp Glu Asn Gly Lys Val He Phe Asp He 
150 135 

40 GTG GAT CTC TGT ACC ACC TGG GAG GCC ATG GAG AAG TGT AAG GAT GCA 
Val Asp Leu Cys Thr Thr Trp Glu Ala Met Glu Lys Cys Lys Asp Ala 

150 155 

4S ^ 11^. ^ l^^ ®" '^^^ "TTC AAC CGC AGG CAG CTG 529 

45 Gly Leu Ala Lys Ser He Gly Val Ser Asn Phe Asn Arg Arg Gin Leu 

165 170 

5?? ^ CCA GGA CTC AAG TAC AAG CCT CTC TGC AAC 577 

50 V5t ^y^ ^^y Ly« P«o Val Cys Asn 

ifo 180 185 

CAG GTA GAA TGT CAT CCG TAT TTC AAC CGG AGT AAA TTG CTA GAT TTC 625 
Gin Val Glu Cys His Pro Tyr Phe Asn Arg Ser Lys Leu Leu Asp Phe 
55 200 205 

TGC AAG TCG AAA GAT ATT GTT CTG GTT GCC TAT AGT GCT CTG GGA TCT 673 
Cys Lys Ser Lys Asp He Val Leu Val Ala Tyr Ser Ala Leu Gly Ser 
210 215 220 



^ ?^ ^ ^'^^ WIC TCC CCG GTG CTC TTG GAC 721 

Gin Arg As? Lys Arg Trp Val Asp Pro Asn Ser Pro Val Leu Leu Glu 
225 230 235 

?" T^'T TTG GCA AAA AAG CAC AAG CGA ACC CCA GCC -»69 

Asp Pro Va Leu Cys Ala Leu Ala Lys Lys His Lys Arg Thr Pro Ala 
2«% 245 250 
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CTG ATT GCC CTG CGC TAG CAG CTG CAG CGT GGG GTT GTG GTC CTG GCC 817 
Leu lie Ala Leu Arg Tyr Gin Leu Gin Arg Gly Val Val Val Leu Ala 
255 260 265 

5 AAG AGC TAG AAT GAG CAG CGC ATC AGA CAG AAC GTG CAG GTT TTT GAG 865 
Lys Ser Tyr Asn Glu Gin Arg lie Arg Gin Asn Val Gin Val Phe Glu 
270 275 280 285 

in ^ GCC ATA GAT GGC CTA GAC AGA 913 

10 Phe Gin Leu Thr Ala Glu Asp Met Lys Ala He Asp Gly Leu Asp Arg 
290 295 300 

AAT CTC CAC TAT TTT AAC AGT GAT ACT TTT GCT AGC CAC CCT AAT TAT 961 
Asn Leu His Tyr Phe Asn Ser Asp Ser Phe Ala Ser His Pro Asn Tyr 
13 305 310 315 

^ ?yr ^ ^ lyl TTGCCTGATG ATGTCTACCA 1012 
320 

GAAGGCCCTG TCTGTGGATG GTGACGCAGA GGACCTCTCT ATGCCGGTGA CTGGACATAT 1072 

CACCTCTACr TAAATCCGTC CTCTTTACCG ACTTCAGTCA ACTACAGCTC ACTCCATAGG 1132 

25 CCAGAAATAC AATAAATCCT GTTTAGCGAC TTCACTCAAC TACA6CTCAC TCCATAGGCC 1192 

AGAAATACAA TAAA j^206 

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 324 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Asp Ser Lys Gln Gln Cys Val Lys Leu Asn Asp Gly His Phe Met
1               5                   10                  15

Pro Val Leu Gly Phe Gly Thr Tyr Ala Pro Pro Glu Val Pro Arg Ser
            20                  25                  30

Lys Ala Leu Glu Val Thr Lys Leu Ala Ile Glu Ala Gly Phe Arg His
        35                  40                  45

Ile Asp Ser Ala His Leu Tyr Asn Asn Glu Glu Gln Val Gly Leu Ala
    50                  55                  60

Ile Arg Ser Lys Ile Ala Asp Gly Ser Val Lys Arg Glu Asp Ile Phe
65                  70                  75                  80

Tyr Thr Ser Lys Leu Trp Ser Thr Phe His Arg Pro Glu Leu Val Arg
                85                  90                  95

Pro Ala Leu Glu Asn Ser Leu Lys Lys Ala Gln Leu Asp Tyr Val Asp
            100                 105                 110

Leu Tyr Leu Ile His Ser Pro Met Ser Leu Lys Pro Gly Glu Glu Leu
        115                 120                 125

Ser Pro Thr Asp Glu Asn Gly Lys Val Ile Phe Asp Ile Val Asp Leu
    130                 135                 140

Cys Thr Thr Trp Glu Ala Met Glu Lys Cys Lys Asp Ala Gly Leu Ala
145                 150                 155                 160

Lys Ser Ile Gly Val Ser Asn Phe Asn Arg Arg Gln Leu Glu Met Ile
                165                 170                 175

Leu Asn Lys Pro Gly Leu Lys Tyr Lys Pro Val Cys Asn Gln Val Glu
            180                 185                 190

Cys His Pro Tyr Phe Asn Arg Ser Lys Leu Leu Asp Phe Cys Lys Ser
        195                 200                 205

Lys Asp Ile Val Leu Val Ala Tyr Ser Ala Leu Gly Ser Gln Arg Asp
    210                 215                 220

Lys Arg Trp Val Asp Pro Asn Ser Pro Val Leu Leu Glu Asp Pro Val
225                 230                 235                 240

Leu Cys Ala Leu Ala Lys Lys His Lys Arg Thr Pro Ala Leu Ile Ala
                245                 250                 255

Leu Arg Tyr Gln Leu Gln Arg Gly Val Val Val Leu Ala Lys Ser Tyr
            260                 265                 270

Asn Glu Gln Arg Ile Arg Gln Asn Val Gln Val Phe Glu Phe Gln Leu
        275                 280                 285

Thr Ala Glu Asp Met Lys Ala Ile Asp Gly Leu Asp Arg Asn Leu His
    290                 295                 300

Tyr Phe Asn Ser Asp Ser Phe Ala Ser His Pro Asn Tyr Pro Tyr Ser
305                 310                 315                 320

Asp Glu Tyr *

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 730 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

AAGAACAAAT ACTATTAAGG CACTGCTTGC ATATATTAAA TGATGTCCAA ACTCCAAAAA 60 

CTGTTAATAA TTAACACTCC AATAAAAACT ACACCAGAAT TTCTTTTTAT TTGCACCCTC 120 

60 ATCAGGATTA CAGCTTTATC AGGACTGCAT CTTCTTCAGA AATGAATATT TCTCTTACAA 180 

CGCAAAGAAA GAAAAATCAA AATAAATTTT CTGATTGAAA ATGTAAAAAG GCAAATATTT 240 

TTACAGTTTT AACTTTAATT TTTTATTGAG GACCAACTGT TTGAAAAATT CTCATTAGTC 300 

ATTCCTTTAA ATTATGTGTA T6TGASAGAA AGACGTAAGA TGGTTAATTA 7TTCAAATGA 360 
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TGCAGTATAA AGAAGGGGCA TTATCACGGC AGAAACGAAA AAAGATATTT GTAGCTGGW; 
GTTTTTATAG TCTAACATAT GGTTGCTATT TGTTCTACAA ATCCTTTTGA ATAATTTAAT 
5 ATAGAGATTT CGAATAGAAA ATAATACTTT AGATAGAAAT TAATGAGTTT ATTATAACCA 
TATATTATAA TAATTTACTT AC5GAATTCTC TTTGATAAGA AACAAATGAA CTGAATGCAA 
TTTTCTCCAC AGACCATATA W5ACTGCCTA TCTACCtCCT CCTACATGCC ATTGGTTAAC 660 
CATCAGTCAG TTTGCAGGGG TGGGGGGAGG GGTTTCCTGC CCATT G TTTT TGTAATCTCT 
GAGGAGAAGC 
(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 121 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: exon
(B) LOCATION: 38..121

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

ACCAGCAAAC ATTTGCTAGT CAGACAACTG ACAGGGAATG GATTCCAAAC AGCACTOTGT 
^ AAAGCTAAAT GATGGCCACT TCATGCCTGT ATTGGGATTT GCCACCTATG CACCTCCAGA 

G 

:2) INFORMATION FOR SEQ ID NO: 5: 

45 (i) SEQUENCE CHARACTERISTICS: 

:A) LENGTH: 252 base pairs 

(B) TYPE: nucleic acid 
:C) STRANDEDNESS: Single 

^ (3) TOPOLOGY: linear 

Ui) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
55 (iv) Aim-SENSE: NO 



60 (xi) SEQUENCE DESCRIPTION: SEQ ID N0:5: 

GTAAGAATAA TTCCTTTTAG TTTTCGGATT TCAAAAGAAT AAACCTAGTA GAAGTGAAAC 
65 ^^'^™G3 TTGTAA6GTT CGTGTTCCTA CCTTACTCTG GATGACTCAC TGGTCTAGGT 120 



60 
120 
121 



60 



TTCCTAGGCT AGGAGAAAAA AGTAGGCAAT CCTTGTTCTG CATTGAGGTC CATTCCTATG 



180 
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GTCACGTACT GCTTATTTTT CGTTTGTGCA CTGTTTCTTT CTTCTGTTCA TGTCTAGTTC 240 
CCAGCTTGGC AG 
S (2) INFORMATION FOR 5£Q ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4X0 base pairs 

(B) TYPE: nucleic acid 
10 (C) STRANDEC3NESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: ONA (genomic) 

IS (iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

20 



60 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
^ GGAAGTCTGA GTGAGCATTC TGTGTAATAT CACTGGGAGA GAACTCATAT GAGCTTGCAC 

CGTTTCCCTT CTATACTCCA TGTGATTTTT ACCATGTATA ATATCACTAT ATTAAAAATA 120 

ATTAGGACTA TTTCAGTCAT GTTAACTTTT CCAACAAATC ACTGAATCTG AGGGTGTTAT 180 

30 GTOGTACCTC CATAACAGTG ATCAACCAGA GATTGCCTGA GACTGAAGGT GTTlVr GGSA 240 

T6CTCAACCT TTATTACTAA CCAGGAAAGA CTCAGGCAAA CTGAGATGGA CTTTTCACCC 300 

CACATACAGA CAGGAGGAAA AGCTGATTCT TGTATAAAAG TCAATGCTTG TGCCTGAACT 360 

ACCTCTCAGC CACAGTGATC ACCAGATACT ACCTTTGGTT GCTCCTCCAG 410 
(2) INEX}RMATION FOR SEQ ID NO: 7: 

40 (i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 166 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEI^IESS: single 
<D) TOPOX^Y: linear 

43 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
SO (iv) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME /KEY: exon 
S5 (B) LOCATION: 1..16e 

(xi) SEQUENCE DESCRIPTION: SEQ ID N0:7: 
60 GTTCCGAGAA GTAAACKrTTT GGAGGTCACA AAATTAGCAA TAGAAGCTGG GTTCCGCCAT 60 
ATAGATTCTG CTCATTTATA CAATAATGAG GAGCAGGTTG GACTGGCCAT CCGAAGCAAG 120 
ATTGCAGATG GCAGTGTGAA (SAGAGAAGAC ATATTCTACA CTTCAAAG 168 
(2) INFORMATION FOR SEQ ID NO: 8: 
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30 
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(i) SEQUENCE CHARACTERISTICS: 

lA) LENGTH: 700 base pairs 

(B) TYPE: nucleic acid 

(C) STMNOEDNESS: single 
CO) TOPOLOGY: linear 

(ii) HOLECOLE TYPE: DNA (genomic) 

(iii) HYPOTKETZCAL: NO 

(iv) ANTI-SENSE: NO 



2^ CTTTACTTTC TAAGCAACAT AAATAGCTAT TCTTAAGCAT TGGGTTGAAT GGATAGAAGA 540 

ATTAGACTGT TAAAATGAGT TGTAAACTCT ACTGAAGATA ATTCAGGTAA CATCATAGTT 

ATTACTTAAT ACTAATCTTT ACATTTTAAG AATTTACTCC TATCATTCAG TAGATGTACA 

40 AACTATACAT CCAACCTATA ATAAAGTTTA TAAGGATAGG 

(2) INFORMATION FOR SEQ ID NO: 9: 

<i) SEQUENCE CHARACTERISTICS: 
^5 (Al LENGTH: 7 3 base pairs 

(B) TYPE: nucleic acici 

(C) STRANOEDNESS: single 

(D) TOPOLOGY: linear 

SO (ii) NOLECOLE TYPE: K<A (genomic) 

(iii) HYPOTHETZCAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTICS^ : SEQ ID NO: 9: 
ACTAGATGGC ACAAAGTAAT AAGATTTGCT CAAGCATTCA TTCAAAATCA CCTCCATTCT 
TTAACCTCTG CAG 
65 (2) INFORMATION FOR SEQ ID NO: 10: 
(i) SEQUENCE CHARACTERISTICS: 



60 
120 



(xi) SEQUENCE DESCRIPTION: SEQ ID HOiBt 
GTACTGTGTC TATGATGAGC TTGTGTGCAC ATGTATTTAT TGTGATTCTG TGGAGATGAC 
AATTCTATGA CTGGATGAGT AGTTGTGGGT GAATTTTGCT TCTGGGTTCA AATTTATTCA 
CACATACTCA CATACTAAAA CTGAAATCAA AATCAAGGAA TGATGATCAC TTTTCATTTT 160 
2^ CGCT6TGTTC CAATTTATGA CCTGAAAGTC CCTTTACTTT TTTGAGCTTC AGCCGAGATC 240 
AGTGTGATTT GACATGTGCT ATAGAATCAC AGAGAACAAT AATCATGTTA TGGTTTTTCT 300 
TATCGCCTGG GTGATTTTCT AAGATTTCTT ATTATTCTCT CAATTGCTAT CTTTATCAGT 360 
GAGATAGAAA GCAATATAAG AAAGCTCTGG 6AGTATTAAA TAATAGACAC TTAAATT6TC 
CTAAATTGTG TCCAGCATAG TGAGCATGTT CAAAACTTGT TTTACCCCCC TTTTATGTTG 



420 
480 



600 
660 
700 



60 
73 
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(A) LENGTH: 117 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDHESS: single 
^ (0) TOPOLOGY: linear 

(11) HOLECULE TYPE: DNA (genomic) 

(ill) HYPOTHETICAL: NO 

10 <iv) ANTI-SENSE: NO 

(Ix) FEATURE: 

(A) NAME/KEY: exon 
15 (B) LOCATION: l..ll"7 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
20 CTTTGGTCCA CTTTTCATCG ACCAGAGTTG GTCCGACCAG CCTTGGAAAA CTCACTGAAA 60 
AAAGCTCAAT TGGACTATGT TGACCTCTAT CTTATTCATT CTCCAATGTC TCTAAAG 117 
(2) INFORHATION FOR SEQ ID NO:ll: 

25 

(1) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 152 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
30 <D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ill) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



35 



40 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:ll: 
GTATGCAGTT TGTATGAGCA TAAAATTGCG CTTCTGCTCTr CATTATAAAC ATTGTTTATC 60 
45 TGGATAGTTG AACAGAGCTT TTTATTAGGA GGATGTAGGG ATTATCACAC AGAAGAAGAA 120 
CCGTAAGTGG AACACCTAAT TTCCTTTCTT TC 152 
(2) INTORMATION FOR SEQ ID NO: 12: 

50 

(i) SEQUENCrE CHARACTERISTICS: 

(A) LENGTH: 208 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
55 (D) TOPOLOGY: linear 



60 



65 



(il) MOLECULE TYPE: DNA (genonic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE s NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
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ATATAATATT TGTAAGAGAT TAGAGGAAGC CTGTCTCCTG AATACATTCC TTArACCTTC 60 
ATATGTAAAA CACTTAGCAC ATAtCACTTT CTGGACCATT GTACCACCTG TCTCATGGAG 120 
SATTAGTGTC CTTAAAGGTA CCTeGGGTTA CAGCTATGAG TGGAGAAATT AATTTGTGAC 180 
ATCATTAAAA TGACTGCTTC TATTTCA6 
C2) INFDRMATION FOR SEQ ID NO: 13: 



Ci) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 78 base pairs 

(B) TYPE: nucleic acid 
xc STRANDEDNESS: single 
1^ (D) TOPOLOGY: linear 

(li> MOLECOLE TYPE: DNA t genomic) 

Ciii) HYPOTHETICAL: NO 

(Ivl ANTI-SENSE: NO 



(Ix) FEATURE: 

CA) NAME/KEY: exon 
(B) LOCATION: 1..78 



30 SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

CCAGGTGAGG AACTTTCACC AACAGATGAA AATGGAAAAG TAATATTTGA CATAGTGGAT 60 
CTCTGTACCA CCTG06AG 

78 

35 .2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 98 base pairs 
^ (B) TYPE: nucleic acid 

W (C) STRANDEDNESS: single 

'D\ TOPOLOGY: linear 

Ui) MOLECULE TYPE: DNA (genomic) 
45 Ciii) HYPOTHETICAL: NO 

<iv) ANTI-SENSE: NO 

SO 

Ixil SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

3TGAGTGCT7 GGCOGAGAGG ACACAGAGAA GGATGACAAA AAGAGAAAAT CTGTTTCCCA 60 

3GTTCGATAG GAAAGAATGG AATATGCACC ' ATTAGATC 9g 

2) INFORMATION FOR SEQ ID NO: 15: 

60 (i) SEQUENCE CHARACTERISTICS: 

:A) LENGTH: 249 base pairs 
!3) TYPE: nucleic acid 
:C) STRANDEDNESS: single 
*:^) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
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liii) HYPOTHETICAL: NO 
(iv) ANTI-SCKSE: NO 



(Hi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

10 GACAGGAATC TCTTTCCTTG CTTGTGCATT AATCTATGCA GTTTCCTAAG GAAGAGATAG 60 

AAATTCTTAC TCTTGCTGCC TCTATCTTCT TCCCCTATTT GCTGTTTGAA TTTTTCTTTT 120 

TTTGACAATC ACTGCTAGCT ATTTTCATTG TCATACTTTG AAAGTTGTTG CTCTCACA6T 160 

TCTGTCTTGC ATTTACCGTG ATTTGCAGCC AACTGCACAft ATAATTCCTC ACAACCCCTT 240 

TCTCCACAG 249 
20 <2) INFORMATION FOR SEQ ID NO:16: 

(i> SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 123 base pairs 

(B) TYPE: nucleic acid 
25 (C) STRANOEONESS: single 

(D) TOPOIOGy: linear 

Ui) MOLECULE TYPE: DMA (genomic) 
30 liii) HYPOTHETZCAL: MO 

(iv) ANTI-SENSE: NO 

35 (ix) FEATURE: 

(A) NAME/KEY: exon 
(B» LOCATION: 1..123 

40 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

GCCATGGAGA AGTGTAAGGA TGCAGGATT6 GCCAAGTCCA TTGGG6TCTC AAACTTCAAC 60 

CGCAGGCAGC TGGAGATGAT CCTCAACAAG CCAGGACTCA AGTACAAGCC TGTCTGCAAC 120 

CAG 123 

(2i INFORMATICS) FOR SEQ ID NO: 17: 

50 (i) SEQUENCE CHARACTERISTICS: 

CA) LENGTH: 138 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEiniESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
fill) HYPOTHETICAL: NO 
60 liv) ANTI-SENSE: NO 

65 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: 

(TTGAGCTCCC TTG(;CCTTCT CTCCTTTC<jG TTCTTCATGC CCCCTCTTCC TGTCCTATTG 60 
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CCAAATATCT CTTTGfTTTTG TCCCAGTTAT CTTTGTGAAG TAGAAGWTA TCTAGAGW5C 
AAAGCTTCT G TCAAGAAA 
(2> INFORMATION FOR SEQ ID N0:18: 

(1) SEOOENCE CHARACTERISTICS: 

(A) LENGTH: 189 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single . 

(D) TOPOLOGY: linear 

(11} MOLECULE TYPE: OKA Cgenanicl 
(ill) HYPOTHETICAL: KO 
(iv) ANTI-SENSE: NO 

<xi) SEQUENCE DESCRIPTim: SEQ ZD NO:18: 
ATTTCCATTT ATACTTTTAG AAGATATATA AAATTTATTT CTATGAAAAA GGTTATTACT 
TGACAATAAT ATCCTCAGCT CAAATATAAT GCTATACTGA TTATTATTCA GCTTCCTTAC 
TTTCATCTTT TCAATATTAA CATAACTATT TCATATAAAT TCATGCTTCT CTCTTTTGGT 
CAACTGCAG 



12) INPORMATZON TOR SEO ZD NO:19: 

(i) SEQUENCE CHARACTERISTZCS : 

(A) LENGTH: 110 base pairs 

(B) TYPE: nucleic acid 

(C) STRANZ3EDNESS: single 
<D) TOPOLOGY: linear 

111) MOLECULE TYPE: ONA (gencmic) 

(ill) HYPOTHETICAL: NO 

liv) ANTI*SENSE: NO 



(Ik) FEATURE: 

I A) NAME/KEY: axon 
IB) LOCATION: 1..110 



(xi) SEQUENCE DESCRIPTION: SEQ ID N0:19: 
GTAGAATGTC ATCCGTATTT CAACOGGAGT AAATTGCTAG ATTTCTGCAA GTCGAAAGAT 
ATTGTTCTGG TTGCCTATAG TGCTCTGGGA TCTCAACGAG ACAAACGATG 
(2) INFORMATION FOR SEQ ID N0:20: 

(i> SEQUENCE CHARACTERISTZCS: 

(A) LENGTH: 136 base pairs 

(B) TYPE: nucleic acid 

(C) STRANOEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DHA (genonao 
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(iii) KYPOTRETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQOENCE DESCRIPTION: SEQ ID MO: 20: 
GTAATAAAAA CAATGGGACC TTTACATAAA CCTTCATTTT GCAGAAAATT TTTTACTCAG 60 
AGCATCCTCA GTTTCCTGTA GTTAAGTTTC AAGTGGCTCA TGGAGAGGAA AGAGAATTGC 120 
15 GTTTCTGACG AGATCT j^3g 
(2) INFORMATION FOR SEQ ID NO: 21: 

„ (i) SEQUENCE CHARACTERISTICS: 

211 (A) LENGTH: 66 base pairs 

(B) TYPE: nucleic acid 

(C) STRANKDNESS: single 

(D) TOPOLOGY: linear 

25 (ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
Civ) ANTI*SENSE: NO 



ixi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
TTTAGGGAGC TGCCTAACAA ACTATCGGCA GCCTCAGGGC CTCAG CC TTT CTGCCTTTCC 60 
TTCCAG 

40 (21 INFORMATION FOR SEQ ID NO:22: 

(i) SEQUENCE CHARACTERISTICS: 

;A) LENGTH: 166 base pairs 
'B) TYPE: nucleic acid 
43 {C) STRANDEDNESS: single 

(0) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

50 (iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

55 (ix) FEATURE: 

(A) NAME/KEY: axon 

(B) LOCATION: 1..166 

60 Cxi) SEQUENCE DESCRI PT1(»4 : SEQ ID NO:22: 

GGTG<5ACCCG AACTCCCCGG TGCTCTTGGA GGACCCA6TC CTTTGTGCCT TGGCAAAAAA 60 

GCACJMIGCGA ACCCCACCCC TGATTGCCCT GCGCTACCAG CTGCAGCGTG GGGTTGTGGT 120 

CCTGGCCAAG A6CTACAATG AGCAGCGCAT CACACAGAAC CTGCAG 166 
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:2J INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

<A) LENGTH: 136 base pairs 

(B) TYPE; nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI*5ENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 



3TGAGGAGCG GGGCTGTGGG CCTCAGGTCT CCTGCACAGT GTCCTTCACA CGTGTGCTTC 
TTGTAAGGCT CTCAGGACAG CCTTGGGCCA GCTCCATTTC CCTGTATTTC CCATATGAAT 



60 
120 

2j GCTTTGCGTG CATCCT 

!2} INFORMATION FOR SEQ ID NO; 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 266 base pairs 
W (B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: ONA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

45 rCCTATCATG 7GGGCACAAT GTCAGCOCTG TTTCTTCTCC ATTTTCTGTT GAAATTTTCT 60 

CTTTGTCTGC AGAGTTGCAC AGTTTCAATA CATAATATCT AGGAATCGAT TTCTGCTTAT 120 

^ TTTTCGTGAG CTATTCATTG ACCCACCTGA GTGTTTAGAG CTCACTTCTA TAACTGTTTA 180 

AAACTTACCA ATATTTTAAG TATTGTCTCT GCACCCTACT GTCTAATATA CTTGGGGATT 240 

CACAACTC<3C AATCTAAAAA TAATAAAAGT TTTTTATTTC TGATAG 286 

55 .2) INFORMATION FOR SEQ ID NO: 25; 

(i) SEQUENCE CHARACTERISTICS: 
A) LENGTH: 63 base pairs 
^ .3) TYPE: nucleic acid 

C) STRANDEDNESS: single 
.3) TOPOLOGY: linear 

(ii) KCLECUtX TYPE: DNA (genonic) 

65 (iii) HYPOTHETICAL: NO 

(iv) AI.TI-SENSE: NO 
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(Ix) FEATURE: 

(Al NAME/KEY: exon 
5 (B) LOCATION: I.. 83 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 

10 6TTTTTGAGT TCCAGTTGAC TGCAGAGGAC ATGAAAGCCA TAGATGGCCT AGACAGAAAT 60 

CTCCACTATT TTAACAGTGA TAG 83 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

<AI LENGTH: 713 base pairs 
(Bl TYPE: nucXeic acid 
(C) STRANDEDNESS: single 
20 (D) TOPOLOGY: linear 

<ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

<iv} ANTI-SENSE: NO 



30 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 

GTAAGTTTCC TrnTTAAATG G<3TGATCTAA TTTATTTCTG GAGAACGAAT GTAGGATGGG 60 

35 TGTTGAGAGT GACCTCCATA CCAGAGGGAC AGAGGCCAAT GTGAGTCAGA GGTGAGACrTG 120 

GAACTCTCCT GCTGGATTCA CTCCAGAGCT CTGTTCTCTG GCAGGGTGAG TGGGCAGGGA 160 

TCAGCATGGG TCAACCTGTG CCTCTGCTCT CCTGACTCCA TGGAACTTTC CAGAGCAGCC 240 

40 

AACATCATTG CCAA6TCTGC AC6TTCCATA TA0GCCTG6T GTTTCTACCA CTGGACATGC 300 

TGTGGATACT GCCCATGTGA CTTCATTAGA TGTTTCCAAA TCTGTGCTTA TATCACATTG 360 

45 TCCCAAACCT GCTCAGCTCC TTATCAAATC AAAAACATTT CCATCrAACTT TGTGGTCCAG 420 

(^GCXAATTC CCACCTCCTT CATATGGAAT TGCTTGCTAG ATCCTGTCAA TTCAGCATCT 480 

TTTATTATTT CAAATGTTTT TCCTCCTTCT CCTTGCAGGT TTGTTCATGC CCCAAACTCT 540 

GCTTTTGCCT CCAGAAA6CC TTCCTTAGTG GAGTGAATAG GAGTGCTTGT CCTTGATTTC 600 

CTGCAATATG GA6CTCTCAA GGCAGAGAAT TTAAAAAAAT TTAAAATCAA GGAGTGTGAG 660 

55 TGTGGAGGCA GAAGCTCCAT TGTTGTATAT AATTTGTAGC TGATAAAAGA TCT 713 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 
60 :A) LENGTH: 415 base pairs 

•'3) TYPE: nucleic acid 
:C) STRANDEDNESS: single 
!3) TOPOLOGY: linear 

65 Iii) MOLECULE TYPE: DNA (genomic} 

(iii) HYPOTHETICAL: NO 




(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TTTAATGCAC TGTAGCTCCT TGGATATTAG ACCCTATATC ATATATAACA ATTTACATTT 60 

CTCAATCTTA CAAAATATAT TGCATACAGT AGGCAGTAGC AGGTAATAAG TAAAGTAACA 120 

AAAGAAAGTA TAATCAGAGT ATCTCTGCTC T6CTGACAGA TGTACAGGAA TATACTTGAA 180 

15 TATTTGACTT TGTGT€?rTTT ACGTGTTAAC TTCCAGATAA GGGAATATGA TTGAATAATT 240 

TATTATTTTG AAAATACTGT ATTATGAAGC CATGTTCATA AAGGTAAGAA AGGCAGATTC 300 

TACAACTAGT CAGACAACTT AACATTCATA CTAATGACAG CTTCATTGAA ATCACTTTAC 360 

TACTCCCCTA GTAATGGAGT CATTCCATTT ATATTATACA TTATTCTCTT TTCAG 415 

(2) INFOBHATION FOR SEQ ID NO: 28: 

23 (i) SEQUENCE CHARACTERISTICS: 

(A> LENGTH: 230 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNES5: single 

(D) TOPOLOGY: linear 

30 

(ii) MOLECULE TYPE: DNA (genonic) 
<iii) HYPOTHETICAL: NO 
35 riv) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: exon 
40 (B) LOCATION: 1..230 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

45 TTTTGCTAGC CACCCTAATT ATCCATATTC AGATGAATAT TAACATGGAG GGCTTTGCCT 60 

GATGATGTCT ACCAGAAGGC CCTGTGTGTG GATGGTGACG CAGAGGACGT CTCTATGCCG 120 

^ GTGACTGGAC ATATCACCTC TACTTAAATC CGTCCTGTTT AGCGACTTCA GTCAACTACA IBO 

GCTGAGTCCA TAGGCCAGAA AGACAATAAA TTTTTATCAT TTTGAAATAA 230 

(2) INFX>RMATION FOR SEQ ID NO: 29: 

55 (i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 109 base pairs 

(B) TYPE: nucleic acid 
fC) STRANDEDNESS: single 

,^ (D) TOPOX-OGY: linear 

oO 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

65 (iv) Atm-SENSE: NO 






(xi) SEQUENCE D£SCRIPTrc»4: SEQ ID NO: 29: 
rTGAATGTTT TCTCAAAGAT TCTTTACCTA CTCTGTTCTG TAGTGTGTGT TTTCTTCTGG 60 
CTCAGAAGTG TGTGTGTGTG TGTGTGTGCT TTCTTCTGGC TCAACAGGG 109 
(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

TTTAGCTTTA CACACTGCTG TT 22

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

TCCAAAGCTT TACTTCTCGG 20

(2) INFORMATION FOR SEQ ID NO: 32: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 16 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

GATGAAAAGT GGACCA 16

(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

ATCTGTTGGT GAAAGTTC 18

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

TCCAGCTGCC TGCGGT 16

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

CTTGTACTTG AGTCCTG 17




(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

CTCCGGTTGA AATACGGA 18

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

CATCGTTTGT CTCGTTGAGA 20

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

TCACTGTTAA AATAGTGGAG AT 22
(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

ATCTGAATAT GGATAAT 17

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

TTCTCGGAAC CTGGAGGAGC 20

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

GACACAGTAC CTTTGAAGTG 20

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

TGGACCAAAG CTGCAGAGGT 20

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

CCTCACCTGG CTGAAATAGA 20

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

AAGCACTCAC CTCCCAGGTG 20

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

GACATTCTAC CTGCAGTTGA 20

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

CTCAAAAACC TATCAGAAA 19

(2) INFORMATION FOR SEQ ID NO: 47:



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

GGAAACTTAC CTATCACTGT 20

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

GCTAGCAAAA CTGAAAAGAG 20

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

GAGAAATATT CATTCTG 17

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

CGAGTCCTGA TAAAGCTG 18

(2) INFORMATION FOR SEQ ID NO: 51:



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

GATGAGGGTG CAAATAA 17

(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

GGAGTGTTAA TTAATAACAG TTT 23

(2) INFORMATION FOR SEQ ID NO: 53:



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

CAGAGATTAC AAAAACAAT 19

(2) INFORMATION FOR SEQ ID NO: 54:

45 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

TGCCTTTTTA CATTTTCAAT CA 22
(2) INFORMATION FOR SEQ ID NO: 55:
(i) SEQUENCE CHARACTERISTICS: 




(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

ACACATAATT TAAAGGA 17
(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:
TTAAATTATT CAAAAGG 17 
(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

AAGAGAAATA TTCATTTCTG 20
(2) INFORMATION FOR SEQ ID NO: 58: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

CCCCTCCCCC CACCCCTGCA 20

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
CTGCCGTGAT AATGCCCC 18 
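
The entries of the sequence listing can be checked mechanically. The following Python sketch is an illustration only and is not part of the original listing: it strips the ten-base grouping spaces, compares the counted number of bases with the declared LENGTH value, and derives the reverse complement of an antisense oligonucleotide. The dictionary `ENTRIES` and all function names are assumptions introduced for this example, and only three entries are transcribed.

```python
# Illustrative sketch; all names here are hypothetical, not part of the listing.
# Maps SEQ ID NO -> (declared LENGTH, sequence as printed in groups of ten).
ENTRIES = {
    30: (22, "TTTAGCTTTA CACACTGCTG TT"),
    48: (20, "GCTAGCAAAA CTGAAAAGAG"),
    59: (18, "CTGCCGTGAT AATGCCCC"),
}

# Translation table for Watson-Crick pairing (plain DNA alphabet only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize(printed: str) -> str:
    """Drop the grouping spaces and upper-case the bases."""
    return "".join(printed.split()).upper()


def length_matches(declared: int, printed: str) -> bool:
    """Return True when the counted bases equal the declared LENGTH."""
    return len(normalize(printed)) == declared


def reverse_complement(printed: str) -> str:
    """Return the reverse complement, written 5' to 3'."""
    return normalize(printed).translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for seq_id, (declared, printed) in ENTRIES.items():
        status = "ok" if length_matches(declared, printed) else "MISMATCH"
        print(f"SEQ ID NO: {seq_id}: length {status}, "
              f"reverse complement {reverse_complement(printed)}")
```

Run as a script, the sketch reports whether each transcribed entry contains the declared number of bases and prints the strand to which the antisense oligonucleotide would be expected to anneal.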
CLAIMS



We claim: 



1. An isolated nucleotide sequence encoding type 5 17β-hydroxysteroid
dehydrogenase, said sequence being sufficiently homologous to SEQ ID No. 1 or a
complement thereof, to hybridize under stringent conditions to the coding region of
SEQ ID No. 1 or a complement thereof and said sequence encoding an enzyme which
catalyzes the conversion of progesterone to 20α-hydroxyprogesterone and the
conversion of 4-androstenedione to testosterone.

2. The nucleotide sequence, as recited in claim 1, wherein said sequence is the
coding region of SEQ ID No. 1.

3. A recombinant expression vector comprising a promoter sequence and a
nucleotide sequence in accordance with claim 1.

4. A recombinant expression vector comprising a promoter sequence and a 
nucleotide sequence in accordance with claim 2. 

5. A recombinant host cell, transformed or transfected with the vector of claim 4.

6. The recombinant host cell of claim 5, wherein said host cell is a eukaryotic
cell.

7. A recombinant host cell, transformed or transfected with the vector of claim 3.

8. The recombinant host cell of claim 7, wherein said host cell is a eukaryotic
cell.

9. The recombinant host cell of claim 8, wherein a nucleotide sequence that
hybridizes under stringent conditions with SEQ ID No. 1 or its complement is
integrated into the genome of said host cell.

10. The recombinant host cell of claim 9, wherein said nucleotide sequence is
located on a recombinant vector.

11. The recombinant host cell, as recited in claim 8, wherein said host cell is
capable of expressing a biologically active type 5 17β-hydroxysteroid dehydrogenase.

12. An isolated nucleotide sequence comprising at least ten consecutive nucleotides
identical to 10 consecutive nucleotides in the coding region of SEQ ID No. 1, or the
complement thereof.

13. The nucleotide sequence, as recited in claim 12, wherein said sequence
comprises at least fifteen consecutive nucleotides identical to 15 consecutive
nucleotides in the coding region of SEQ ID No. 1, or the complement thereof.

14. The nucleotide sequence, as recited in claim 13, wherein said sequence
comprises at least twenty consecutive nucleotides identical to 20 consecutive
nucleotides in the coding region of SEQ ID No. 1, or the complement thereof.

15. The nucleotide sequence, as recited in claim 13, wherein said sequence
comprises at least thirty consecutive nucleotides identical to 30 consecutive nucleotides
in the coding region of SEQ ID No. 1, or the complement thereof.

16. An oligonucleotide sequence selected from the group consisting of 
TTTAGCTTTACACACTGCTGTT (SEQ ID No. 30), 
TCCAAAGCTTTACTTCTCGG (SEQ ID No. 31), GATGAAAAGTGGACCA 
(SEQ ID No. 32), ATCTGTTGGTGAAAGTTC (SEQ ID No. 33), 
TCCAGCTGCCTGCGGT (SEQ ID No. 34), CTTGTACTTGAGTCCTG (SEQ ID 
No. 35), CTCCGGTTGAAATACGGA (SEQ ID No. 36),
CATCGTTTGTCTCGTTGAGA (SEQ ID No. 37), 
TCACTGTTAAAATAGTGGAGAT (SEQ ID No. 38), and 
ATCTGAATATGGATAAT (SEQ ID No. 39).

17. An oligonucleotide sequence selected from the group consisting of
TTCTCGGAACCTGGAGGAGC (SEQ ID No. 40),
GACACAGTACCTTTGAAGTG (SEQ ID No. 41),
TGGACCAAAGCTGCAGAGGT (SEQ ID No. 42),
CCTCACCTGGCTGAAATAGA (SEQ ID No. 43),
AAGCACTCACCTCCCAGGTG (SEQ ID No. 44),
GACATTCTACCTGCAGTTGA (SEQ ID No. 45), CTCAAAAACCTATCAGAAA
(SEQ ID No. 46), GGAAACTTACCTATCACTGT (SEQ ID No. 47), and
GCTAGCAAAACTGAAAAGAG (SEQ ID No. 48).

18. An oligonucleotide sequence selected from the group consisting of
GAGAAATATTCATTCTG (SEQ ID No. 49), CGAGTCCTGATAAAGCTG (SEQ
ID No. 50), GATGAGGGTGCAAATAA (SEQ ID No. 51),
GGAGTGTTAATTAATAACAGTTT (SEQ ID No. 52),
CAGAGATTACAAAAACAAT (SEQ ID No. 53),
TGCCTTTTTACATTTTCAATCA (SEQ ID No. 54), ACACATAATTTAAAGGA
(SEQ ID No. 55), TTAAATTATTCAAAAGG (SEQ ID No. 56),
AAGAGAAATATTCATTTCTG (SEQ ID No. 57),
CCCCTCCCCCCACCCCTGCA (SEQ ID No. 58), and
CTGCCGTGATAATGCCCC (SEQ ID No. 59).

19. A recombinant expression vector comprising:
a promoter sequence; and
an oligonucleotide sequence selected from the group consisting of
TTTAGCTTTACACACTGCTGTT (SEQ ID No. 30),
TCCAAAGCTTTACTTCTCGG (SEQ ID No. 31), GATGAAAAGTGGACCA
(SEQ ID No. 32), ATCTGTTGGTGAAAGTTC (SEQ ID No. 33),
TCCAGCTGCCTGCGGT (SEQ ID No. 34), CTTGTACTTGAGTCCTG (SEQ ID
No. 35), CTCCGGTTGAAATACGGA (SEQ ID No. 36),
CATCGTTTGTCTCGTTGAGA (SEQ ID No. 37),
TCACTGTTAAAATAGTGGAGAT (SEQ ID No. 38), and
ATCTGAATATGGATAAT (SEQ ID No. 39).

20. A recombinant expression vector comprising:
a promoter sequence; and
an oligonucleotide sequence selected from the group consisting of
TTCTCGGAACCTGGAGGAGC (SEQ ID No. 40),
GACACAGTACCTTTGAAGTG (SEQ ID No. 41),
TGGACCAAAGCTGCAGAGGT (SEQ ID No. 42),
CCTCACCTGGCTGAAATAGA (SEQ ID No. 43),
AAGCACTCACCTCCCAGGTG (SEQ ID No. 44),
GACATTCTACCTGCAGTTGA (SEQ ID No. 45), CTCAAAAACCTATCAGAAA
(SEQ ID No. 46), GGAAACTTACCTATCACTGT (SEQ ID No. 47), and
GCTAGCAAAACTGAAAAGAG (SEQ ID No. 48).

21. A recombinant expression vector comprising:
a promoter sequence; and
an oligonucleotide sequence selected from the group consisting of
GAGAAATATTCATTCTG (SEQ ID No. 49), CGAGTCCTGATAAAGCTG (SEQ
ID No. 50), GATGAGGGTGCAAATAA (SEQ ID No. 51),
GGAGTGTTAATTAATAACAGTTT (SEQ ID No. 52),
CAGAGATTACAAAAACAAT (SEQ ID No. 53),
TGCCTTTTTACATTTTCAATCA (SEQ ID No. 54), ACACATAATTTAAAGGA
(SEQ ID No. 55), TTAAATTATTCAAAAGG (SEQ ID No. 56),
AAGAGAAATATTCATTTCTG (SEQ ID No. 57),
CCCCTCCCCCCACCCCTGCA (SEQ ID No. 58), and
CTGCCGTGATAATGCCCC (SEQ ID No. 59).

22. A method of blocking synthesis of type 5 17β-HSD, comprising the step of:
introducing an oligonucleotide selected from the group consisting of
TTTAGCTTTACACACTGCTGTT (SEQ ID No. 30),
TCCAAAGCTTTACTTCTCGG (SEQ ID No. 31), GATGAAAAGTGGACCA
(SEQ ID No. 32), ATCTGTTGGTGAAAGTTC (SEQ ID No. 33),
TCCAGCTGCCTGCGGT (SEQ ID No. 34), CTTGTACTTGAGTCCTG (SEQ ID
No. 35), CTCCGGTTGAAATACGGA (SEQ ID No. 36),
CATCGTTTGTCTCGTTGAGA (SEQ ID No. 37),
TCACTGTTAAAATAGTGGAGAT (SEQ ID No. 38), and
ATCTGAATATGGATAAT (SEQ ID No. 39) into cells.

23. A method of blocking synthesis of type 5 17β-HSD, comprising the step of:
introducing an oligonucleotide selected from the group consisting of
TTCTCGGAACCTGGAGGAGC (SEQ ID No. 40),
GACACAGTACCTTTGAAGTG (SEQ ID No. 41),
TGGACCAAAGCTGCAGAGGT (SEQ ID No. 42),
CCTCACCTGGCTGAAATAGA (SEQ ID No. 43),
AAGCACTCACCTCCCAGGTG (SEQ ID No. 44),
GACATTCTACCTGCAGTTGA (SEQ ID No. 45), CTCAAAAACCTATCAGAAA
(SEQ ID No. 46), GGAAACTTACCTATCACTGT (SEQ ID No. 47), and
GCTAGCAAAACTGAAAAGAG (SEQ ID No. 48) into cells.

24. A method of blocking synthesis of type 5 17β-HSD, comprising the step of:
introducing an oligonucleotide selected from the group consisting of
GAGAAATATTCATTCTG (SEQ ID No. 49), CGAGTCCTGATAAAGCTG (SEQ
ID No. 50), GATGAGGGTGCAAATAA (SEQ ID No. 51),
GGAGTGTTAATTAATAACAGTTT (SEQ ID No. 52),
CAGAGATTACAAAAACAAT (SEQ ID No. 53),
TGCCTTTTTACATTTTCAATCA (SEQ ID No. 54), ACACATAATTTAAAGGA
(SEQ ID No. 55), TTAAATTATTCAAAAGG (SEQ ID No. 56),
AAGAGAAATATTCATTTCTG (SEQ ID No. 57),
CCCCTCCCCCCACCCCTGCA (SEQ ID No. 58), and
CTGCCGTGATAATGCCCC (SEQ ID No. 59) into cells.

25. An isolated chromosomal DNA fragment which upon transcription and
translation encodes type 5 17β-hydroxysteroid dehydrogenase and wherein said
fragment contains nine exons and wherein said fragment includes introns which are 16
kilobase pairs in length.

26. An isolated DNA sequence encoding type 5 17β-hydroxysteroid
dehydrogenase, said sequence being sufficiently homologous to SEQ ID No. 3 or a
complement thereof, to hybridize under stringent conditions to SEQ ID No. 3, or its
complement.

27. A method for producing type 5 17β-hydroxysteroid dehydrogenase, comprising
the steps of:

preparing a recombinant host transformed or transfected with the vector of
claim 3; and

culturing said host under conditions which are conducive to the production of
type 5 17β-hydroxysteroid dehydrogenase by said host.

28. A method for determining the inhibitory effect of a test compound on the
enzymatic activity of type 5 17β-hydroxysteroid dehydrogenase, comprising the steps
of:

providing type 5 17β-hydroxysteroid dehydrogenase;

contacting said type 5 17β-hydroxysteroid dehydrogenase with said test
compound; and thereafter

determining the enzymatic activity of said type 5 17β-hydroxysteroid
dehydrogenase in the presence of said test compound.

29. The method, as recited in claim 28, wherein said step of determining enzymatic
activity includes the steps of:

adding a substrate which is metabolized by said type 5 17P-hydroxysteroid 
dehydrogenase; and 

determining an amount of said substrate which is converted to metabolite. 

30. A method of interfering with the expression of type 5 17β-hydroxysteroid
dehydrogenase, comprising the step of administering nucleic acids substantially
identical to at least 15 consecutive nucleotides of SEQ ID No. 1 or a complement
thereof.

31. A method of interfering with the synthesis of type 5 17β-hydroxysteroid
dehydrogenase, comprising the step of administering antisense RNA complementary
to mRNA encoded by at least 15 consecutive nucleotides of SEQ ID No. 1 or a
complement thereof.

32. A method of interfering with the expression of type 5 17β-hydroxysteroid
dehydrogenase, comprising the step of administering nucleic acids substantially
identical to at least 15 consecutive nucleotides of SEQ ID No. 3 or a complement
thereof.

33. A method of interfering with the synthesis of type 5 17β-hydroxysteroid
dehydrogenase, comprising the step of administering antisense RNA complementary
to mRNA encoded by at least 15 consecutive nucleotides of SEQ ID No. 3 or a
complement thereof.

34. A method for determining the inhibitory effect of antisense nucleic acids on the
enzymatic activity of type 5 17β-hydroxysteroid dehydrogenase, comprising the steps
of:

providing a host system capable of expressing type 5 17β-hydroxysteroid
dehydrogenase;

introducing said antisense nucleic acids into said host system; and thereafter

determining the enzymatic activity of said type 5 17β-hydroxysteroid
dehydrogenase.
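
The consecutive-identity language of claims 12 to 15 (at least 10, 15, 20 or 30 consecutive nucleotides identical to the coding region or its complement) can be pictured with a short Python sketch. This is an informal illustration and not a construction of the claims: the function names are invented here, "complement" is modeled as the reverse complement (an assumption), and the reference string is a made-up placeholder rather than SEQ ID No. 1, which is not reproduced in this section.

```python
# Illustrative sketch; function names and the toy reference are hypothetical.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove whitespace and upper-case the bases."""
    return "".join(seq.split()).upper()


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (assumed meaning of 'complement')."""
    return clean(seq).translate(_COMPLEMENT)[::-1]


def windows(seq: str, k: int) -> set:
    """All substrings of length k; empty when seq is shorter than k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shares_consecutive_run(candidate: str, reference: str, k: int = 10) -> bool:
    """True if candidate has k consecutive bases identical to k consecutive
    bases of the reference or of its reverse complement."""
    cand, ref = clean(candidate), clean(reference)
    targets = windows(ref, k) | windows(reverse_complement(ref), k)
    return any(w in targets for w in windows(cand, k))


if __name__ == "__main__":
    # Made-up placeholder sequence, NOT SEQ ID No. 1.
    toy_reference = "ATGGATTCCAAACAGCAGTGTGTAAAGC"
    probe = reverse_complement(toy_reference)[:12]
    print(shares_consecutive_run(probe, toy_reference, k=10))
```

Raising `k` to 15, 20 or 30 mirrors the progressively longer runs recited in claims 13 to 15; a candidate shorter than `k` can never satisfy the test.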